bandicoot: a toolbox for mobile phone metadata
Abstract
bandicoot is an open-source Python toolbox to help data scientists analyze mobile phone metadata. With only a few lines of code, bandicoot loads your datasets, visualizes your data, performs analyses, and exports the results.

Every time we send or receive a text or a phone call, our mobile phones generate metadata: who we call, at what time, for how long, and from where. Collected at large scale, these metadata have been used to design transportation systems, plan disaster responses, and fight epidemics [1]. While the use of machine learning algorithms on mobile phone metadata has been evolving fast, the field currently lacks the standardization needed to thrive. Numerous crucial implementation choices are often lost from one research paper to another, making it hard to replicate results, to quantify the impact of new methods, and to transfer knowledge.

We have introduced bandicoot, an open-source Python toolbox, to solve these issues. bandicoot extracts more than 160 robust behavioral features from mobile phone metadata and focuses on making it easy for researchers and practitioners to load metadata and compute features from them. bandicoot indicators fall into three categories (see Figure 1):

1. Individual indicators (e.g. percent of nocturnal interactions, time it takes someone to answer text messages) describe an individual's phone usage.
2. Spatial indicators (e.g. entropy of visited antennas, radius of gyration) describe mobility patterns.
3. Social network indicators (e.g. clustering coefficient, assortativity) describe individuals' social networks and compare their behaviors with those of their contacts.

Emphasis is put on correctness and consistency through numerous unit tests covering 91% of the source code, domain-specific code detecting incorrect entries, and reporting variables to assess data quality or potential data issues.

Figure 1: bandicoot uses individual, spatial, and social features.
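To make the indicator categories more concrete, the following standalone sketch computes one individual indicator (percent of nocturnal interactions) and two spatial indicators (entropy of visited antennas, radius of gyration) on a toy dataset. It uses only the Python standard library and is not bandicoot's implementation: the record layout, the antenna coordinates, the function names, and the 7pm to 7am definition of night are all assumptions made for illustration.

```python
import math
from collections import Counter
from datetime import datetime

# Hypothetical toy records for one user: (timestamp, antenna_id).
RECORDS = [
    (datetime(2014, 3, 2, 8, 30), "A"),
    (datetime(2014, 3, 2, 23, 15), "A"),
    (datetime(2014, 3, 3, 2, 5), "B"),
    (datetime(2014, 3, 3, 14, 0), "B"),
]

# Hypothetical antenna positions as (latitude, longitude) in degrees.
ANTENNAS = {"A": (42.36, -71.09), "B": (42.37, -71.06)}


def percent_nocturnal(records, night_start=19, night_end=7):
    """Fraction of interactions between night_start and night_end (assumed hours)."""
    if not records:
        return None
    night = sum(
        1 for ts, _ in records if ts.hour >= night_start or ts.hour < night_end
    )
    return night / len(records)


def entropy_of_antennas(records):
    """Shannon entropy (natural log) of the distribution of visited antennas."""
    counts = Counter(antenna for _, antenna in records)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def haversine_km(p, q):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def radius_of_gyration(records, antennas):
    """Root mean square distance (km) of visited positions from their centroid.

    The centroid is approximated by averaging latitudes and longitudes,
    which is reasonable only for small areas.
    """
    points = [antennas[a] for _, a in records if a in antennas]
    if not points:
        return None
    center = (
        sum(lat for lat, _ in points) / len(points),
        sum(lon for _, lon in points) / len(points),
    )
    return math.sqrt(
        sum(haversine_km(p, center) ** 2 for p in points) / len(points)
    )


if __name__ == "__main__":
    print("percent nocturnal:", percent_nocturnal(RECORDS))
    print("antenna entropy:", entropy_of_antennas(RECORDS))
    print("radius of gyration (km):", radius_of_gyration(RECORDS, ANTENNAS))
```

A toolbox such as bandicoot adds what this sketch leaves out: consistent handling of missing or incorrect entries, grouping by time period, and reporting variables that describe data quality.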
bandicoot is currently used in large-scale experiments by carriers (e.g. Orange, Telenor), NGOs, and international organizations (World Bank). bandicoot's behavioral indicators were, for instance, used to predict users' personality traits [3], namely the Big Five indicators (neuroticism, extraversion, etc.), resulting in accuracies up to 70% better than random. A similar methodology has recently been used to predict the gender of users in low- and medium-income countries. Figure 2 (reproduced from [2]) shows that a training set of 10,000 people is enough to reach an accuracy ranging from 74.3% to 88.4%.

Figure 2: Gender prediction reaches 74.3% accuracy with a training size of 10,000 people.

During the "Data for Development" challenge, an international research challenge [4] using a massive anonymized dataset provided by the telecommunication company Orange, bandicoot was used to address socio-economic development questions in Ivory Coast. The best contributions explored disease containment for country-wide epidemics, social divisions, and the optimization of public transportation.
Related resources
bandicoot: a Python Toolbox for Mobile Phone Metadata
bandicoot is an open-source Python toolbox to extract more than 1442 features from standard mobile phone metadata. bandicoot makes it easy for machine learning researchers and practitioners to load mobile phone data, to analyze and visualize them, and to extract robust features which can be used for various classification and clustering tasks. Emphasis is put on ease of use, consistency, and do...
D4D-Senegal: The Second Mobile Phone Data for Development Challenge
The D4D-Senegal challenge is an open innovation data challenge on anonymous call patterns of Orange's mobile phone users in Senegal. The goal of the challenge is to help address society development questions in novel ways by contributing to the socio-economic development and well-being of the Senegalese population. Participants in the challenge are given access to three mobile phone datasets. T...
Media Content Metadata and Mobile Picture Sharing
This paper describes two systems for picture taking with mobile phone cameras. The first system, MMM, is a platform for generating semantically rich metadata for mobile pictures at the time of image capture. The second system, MobShare, is a mobile picture sharing and discussing system that takes advantage of the temporal and social information in the camera phone. The paper also discusses comb...
Using Deep Learning to Predict Demographics from Mobile Phone Metadata
Mobile phone metadata are increasingly used to study human behavior at large scale. There has recently been a growing interest in predicting demographic information from metadata. Previous approaches relied on hand-engineered features. We here apply, for the first time, deep learning methods to mobile phone metadata using a convolutional network. Our method provides high accuracy on both age and...
Designing User-Centric Metadata for Digital Snapshot Photography
The amount of personal digital media is increasing, and managing it has become a pressing problem. Effective management of media content is not possible without content-related metadata. In this paper we describe a content metadata creation process for images taken with a mobile phone. The design goals were to automate the creation of image content metadata by leveraging automatically available...
Publication date: 2017